HADOOP-19282. S3A: STSClientFactory: do not use URIBuilder#7966
Conversation
URIBuilder was used from the AWS SDK for Java v2, to be precise from the shaded Apache HTTP Client. It is a problem if a user would like not to use the AWS SDK bundle, since more or less only 3 modules are needed (s3, s3-transfer & sts), but that may cause problems on unshaded dependency versions. Since a URI constructor can achieve the same here I switched it as a preferred option.
|
💔 -1 overall
This message was automatically generated. |
|
Oh hey, way better CI luck this time around :) |
|
@LDVSOFT thanks for this -we never want to make those shaded libraries mandatory. If you can sort out the classpath of a leaner deployment -well done! Those test failures showed you were running them...the prefetch one fails regularly for me too. The PR Is merged to trunk. If you do a cherrypick PR and retest against branch-3.4 I will merge there too |
|
also, have you an asf JIRA ID to assign the fix to you there? If not: request one and tell me what it is. |
|
@steveloughran You're welcome! About leaner classpath: there are pros and cons. S3A, as far as I see, really only needs AWS SDK modules for s3, s3-manager, sts and whatever those would pull (core, sync/async clients, etc), however since dependencies of those aren't shaded, like in bundle, it may cause other problems I can't recommend to anyone outside as of now (as of now I don't trust Maven to handle this complexity 🤫). Also there is some noise as it seems implementation would prefer a particular good async http client, but don't quote me on that… Will cherry-pick for 3.4. I'm not familiar with Jira to understand what ID you mean here, but my username is LDVSoft. |
URIBuilder was used from the AWS SDK for Java v2, from the shaded Apache HTTP Client. It is a problem if a user would like not to use the AWS SDK bundle, since more or less only 3 modules are needed (s3, s3-transfer & sts), but that may cause problems on unshaded dependency versions. Since a URI constructor can achieve the same here I switched it as a preferred option. Contributed by Lapshin Dmitry
|
@LDVSOFT the reason we use the big jar is just that historically it's been so brittle to things like the version of jackson, httpclient and more -so using the unshaded artifacts would put the SDK in charge of defining the versions of those artifacts for everything. regarding JIRA ID, your username is what I meant -you are now listed as the author of the patch there. |
URIBuilder was used from the AWS SDK for Java v2, from the shaded Apache HTTP Client. It is a problem if a user would like not to use the AWS SDK bundle, since more or less only 3 modules are needed (s3, s3-transfer & sts), but that may cause problems on unshaded dependency versions. Since a URI constructor can achieve the same here I switched it as a preferred option. Contributed by Lapshin Dmitry
Description of PR
URIBuilderwas used from the AWS SDK for Java v2, to be precise from the shaded Apache HTTP Client. It is a problem if a user would like not to use the AWS SDK bundle, since more or less only 3 modules are needed (s3, s3-transfer & sts), but that may cause problems on unshaded dependency versions. Since a URI constructor can achieve the same here I switched it as a preferred option.How was this patch tested?
I've run the test suite against a eu-west-1 bucket, without scaling/load since the change shouldn't affect that. To be exact, with something like this:
auth-keys.xmlAlmost all test pass:
ITestDelegatedMRJobwork. They probably clean out environment somewhere and my environment-provided AWS credentials didn't work. Also it looks parametrized, and I can't tell from the Surefire/Failsafe reports which causes a problem.ITestRoleDelegationInFilesystem/ITestSessionDelegationInFilesystemfail a bit inmissmatch2, but I'm really unfamiliar with credentials delegation. Probably lost environment variables on it's way?ITestS3APrefetchingInputStreamfails with 0 size.Given that other tests pass and the scope of the change I think it's fine, and the problem is my test setup misconfiguration. If you know how to fix the setup — I can rerun with some other options.
Also, I've found this bug while repackaging Spark for a local K8S deployment, and with this fix STS configuration options work even if I do replace AWS SDK bundle with only required SDK modules.
For code changes:
LICENSE,LICENSE-binary,NOTICE-binaryfiles?Sign-off
I give a license to the Apache Software Foundation to use this code, as required under §5 of the Apache License.
P.S.
Re-opened from #7483 where there was a review by @steveloughran.